/*
MERGING CLEANED JOBB DATA WITH MERGED FIRM/INDIVIDUAL DATA

Data 	: A1_flagged_varsel_firmperson.dta ; A1_tenure.dta
Folder 	: 
Date	: 2018-01-24

Creator		: Jonas Cederlof	(JC)
Description :	
Notes 		:

LATEST UPDATE : 2018-02-07
*/

********************************************************************************
clear
set more		 off
cap   log close 	_all

log using 	"../log/A2_merged_jobb_firm_pers.log" 	, replace 
use   		"$datapath/A1_cleaned_RAMS.dta" 




*Generate a variable to merge on. Basically it is the n:th time an individual is 
*displaced (which means nothing in this dataset, only in the varseldata that is
*merged on below).
bys persid firmid (date) : gen temp= _n 	
// This temp variable goes far beyond any possible number of displacements, hence 
// no worry that _merge==2 appears due to a temp value missing in the master data.

* Merging persid, firmid and year/month (date) specific observations from the 
* Jobb register onto a notification of displacement which was registered in a 
* specific firm/date/persid.
merge m:1 persid firmid temp  	using "$datapath/A1_varseldata.dta"

* Note: _merge==2 for 8,235 individuals, for two reasons:
*
*	i) People that have _merge==2 are not registered working at the firm, EVER!
*	ii) In the varseldata there are varsels with no firmid (295 obs).


tab varselorsak if _merge==2	 // a majority come from bankruptcies
drop if _merge==2
drop 	_merge


{ // Expand data (generate unique individual panels for each notification)
*===============================================================================
*Note: 	As individuals can get notified several times (maximum 10 in the data)
*	I create a unique panel for each displacement. Each panel has a unique
*	identification number called "lopnr".

*Generate: Variable indicating a notification
gen temp2 = 1 if varselid!=. 
*Generate: Ordinal number (n:th) of the displacement for the individual 
bys persid (date firmid) : gen nrdisp = sum(temp2) if varselid!=.
*Generate: Total number of displacements for the individual (same as max(nrdisp))
bys persid (date) : egen totalnrdisp= total(temp2) // 88% are notified only once
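*Optional check (a minimal sketch, not part of the original pipeline): verify
*the claim above that totalnrdisp equals max(nrdisp) within persid. maxnrdisp
*is a temporary scratch variable introduced only for this check. "cap noi"
*reports a failed assertion without stopping the do-file.
tempvar maxnrdisp
bys persid : egen `maxnrdisp' = max(nrdisp)
cap noi assert totalnrdisp == `maxnrdisp' if totalnrdisp > 0
drop `maxnrdisp'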


*Note the zeros! Those with totalnrdisp==0 are individuals in A0_persid_varsel.dta
*who lack a firmid in A1_varseldata.dta or were dropped in the cleaning of the 
*varseldata. These individuals are dropped below, as they have no notification
*at all or lack one that is firm specific.
tab totalnrdisp
drop if totalnrdisp==0

*Dropping temporary variable
drop temp temp2


*Expand by the number of displacements to create a unique panel for each one
expand totalnrdisp	

*Generate: Variable indicating which displacement the expanded observation 
*	   belongs to. Since date was unique before but is not after the expansion,
*	   this just generates such an indicator
bys persid date firmid: gen dispid = _n
tab dispid // note the correspondence with tab totalnrdisp above
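*Optional check (a minimal sketch, not part of the original pipeline): if
*persid, date and firmid uniquely identified rows before the expand, dispid
*should run from 1 to totalnrdisp. A failure here would suggest duplicate
*persid/date/firmid rows in the input, which is an assumption worth testing.
*"cap noi" reports a failed assertion without stopping the do-file.
cap noi assert inrange(dispid, 1, totalnrdisp)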

*Generate: ID-variable for the individual panel
egen long lopnr = group(persid dispid)
lab var lopnr 	"ID: Panel ID, combination of individual and notification"
}
*
{ // Tsfill data
*===============================================================================
*Note: 	The RAMS data contains gaps. As we would like to set annual earnings to
*	zero for times when an individual is not in the register, I need to fill
*	these time gaps. The panel has repeated dates due to individuals having
*	multiple employers in the same month, hence the somewhat tedious way of 
*	filling in these observations. 

preserve
	keep work lopnr date
	gcollapse (count) work, by(lopnr date)
	
	*Tsfill
	xtset lopnr date
	tsfill 
	
	drop work
	tempfile timefill
	save `timefill'
restore


*Merge on dates
merge m:1 lopnr date 	using `timefill'

*generate indicator for tsfilled dates
gen tsfilled = _merge==2
drop _merge

*Expanding all panels to 2018m12 (where the RAMS data ends)
bys lopnr (date): gen exp = ym(2018,12) - date + 1 if  _n==_N

expand exp if exp!=. ,gen(settomissing)


*Fix the expanded observations

*Replace date to correct date
bys lopnr settomissing (date) : gen nvals = _n if settomissing==1
replace date = date + nvals if settomissing==1
replace tsfilled =1 if settomissing==1

*Set other variables in expanded observations to missing
foreach var of varlist 	 firmid year month lonfink work plantid  yrkstallnku {
	
	replace `var' =. if settomissing==1
}
foreach var of varlist  astsni* {
	replace `var' = "" if settomissing==1
}	

drop nvals settomissing exp
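
*Optional check (a minimal sketch, not part of the original pipeline): after
*the tsfill merge and the end-of-panel expansion, every lopnr panel should run
*without gaps up to ym(2018,12), the end date used in the expand step above.
*lastdate is a temporary scratch variable introduced only for this check.
*"cap noi" reports a failed assertion without stopping the do-file.
tempvar lastdate
bys lopnr (date) : gen `lastdate' = date[_N]
cap noi assert `lastdate' == ym(2018,12)
*Consecutive rows may share a date (several employers) or differ by one month
cap noi bys lopnr (date) : assert date - date[_n-1] <= 1 if _n > 1
drop `lastdate'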

*Filling in variables that turn out missing from the merge
bys lopnr (date) : replace persid = persid[_n-1] if persid==.

*Filling in missing year and month 
gen 	dofm  = dofm(date) 	
replace year  = year(dofm) 	if year==.
replace month = month(dofm) 	if month==.
drop dofm

}
*	
{ // Copy notification data info within each lopnr
*===============================================================================
*Replace to missing those notifications that do not belong to the panel,
*i.e. for workers with multiple notifications. Each notification/worker panel
*has unique notification information. We know the specific information by 
*nrdisp==dispid
foreach var of varlist 	varselid-nottime_def  {
	replace `var'=. if nrdisp!=dispid & dispid!=.
}

replace nrdisp=. if nrdisp!=dispid


*Replacing all missing varsel information within a lopnr
preserve
	
	gcollapse (max) 	varselid-nottime_def  , by(lopnr) fast


	tempfile  varselinfo
	save 	 `varselinfo'
restore

merge m:1 lopnr 	using `varselinfo', update nogen
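
*Optional check (a minimal sketch, not part of the original pipeline): after
*the update merge, the notification info should be filled in and constant
*within each lopnr. Only varselid is tested here, as a representative of the
*varselid-nottime_def range. Sorting on varselid puts missing values last, so
*comparing the first and last row also catches partly filled panels.
*"cap noi" reports a failed assertion without stopping the do-file.
cap noi assert !missing(varselid)
cap noi bys lopnr (varselid) : assert varselid[1] == varselid[_N]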

}
*
{ // Generate eventtime
*===============================================================================
*Unique event time for each panel/notification

bys lopnr (date) : gen eventtime_def = date - notdate_def
}
*

order persid lopnr date eventtime_def  year month firmid work


*Save data (unique at lopnr firmid date [i.e. persid dispid firmid date])
compress
save "$datapath/A2_merged_RAMS_varsel.dta", replace
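
*Optional check (a minimal sketch, not part of the original pipeline): confirm
*the uniqueness stated in the save comment above. The missok option is needed
*because tsfilled and expanded rows have missing firmid. "cap noi" reports a
*failure without stopping the do-file.
cap noi isid lopnr firmid date, missok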


